KIN482E Final Project: League of Legends ARAM Data Analysis¶

Data from https://www.kaggle.com/datasets/bryanchungweather/league-of-legends-aram-champion-status-sept-2023/data

Background Information¶

League of Legends (LoL) is a multiplayer online battle arena game created by Riot Games in 2009. It is a 5v5 game in which each player controls one of 165 characters, viewed from an isometric perspective. This project specifically analyzes data from one of LoL's gamemodes: ARAM (All Random, All Mid).

ARAM is a 5v5 team gamemode where the objective is to destroy the opposing team's home base (Nexus), after first destroying their towers and inhibitor, before the enemy destroys your Nexus (see the map image below). As a result, winning team fights by eliminating enemy players is the key to winning the game. Eliminated players must wait out a cooldown period before they respawn at their home base.

Whether a team wins a fight depends on a variety of factors, such as the characters being played and the individual skill of each player.

ARAM_Map.png

XP, Gold, and CS¶

Your character gets stronger by gaining experience (XP) and by buying items in the shop with gold. XP and gold are both gained passively over time. XP is used to level up your character, which increases their base statistics (e.g., attack damage, total health) and abilities. Gold can also be gained actively by killing enemy players and minions using your character's abilities. Creep score (CS) counts how many minions you 'last-hit': you receive gold from a minion only if you deal the final blow. Simply damaging a minion earns no reward.

Gold is important because it is used to buy items, which significantly increase your champion's base statistics!

ARAM_Minions.webp

Character 'Selection'¶

The catch to ARAM is that the character you play is randomly selected, which sets it apart from other LoL gamemodes where you can freely pick which champion to play. Each of the 10 players in an ARAM game is given a randomly selected champion from the pool of 165, and no two players receive the same champion.

Every player has the option to 're-roll', which means they get another randomly selected champion from the remaining pool. If a player re-rolls, their initial character is moved into the 'Available Champions' pool, located at the top of the screen. Other players on the same team can then swap their character for one in the available champions pool. The opposing team cannot view these champions and has a separate pool of its own.

As a result, characters aren't purely 'random', as there is some degree of 'selection' given the re-roll and swapping mechanisms. However, every player's initially assigned champion is random.

We will explore whether match outcomes differ based on the champion you play.

ARAM_Champ.jpeg

Individual Skill¶

LoL ranks players using a tier system from Iron (lowest rank) to Challenger (highest rank). Your tier is determined by your in-game performance (e.g., KDA, CS). Players 'rank-up' if they win games more often than they lose, and 'rank-down' if they lose more often than they win. When players queue for ARAM games, LoL's matchmaking algorithm matches you with and against players of similar rank.

The explored dataset is a compilation of statistics for each of the 165 characters. Since each statistic is an average over hundreds of thousands, sometimes millions, of games, the analysis assumes that differences in individual skill between players are negligible.

ARAM_Ranks.jpeg

LoL Regions¶

LoL is popular internationally, so it is hosted on regional servers. There is no inter-regional competition (excluding the professional scene), which means players are isolated within their region and adapt only to intra-regional competition. This may give rise to region-specific play-styles. We will also examine whether any differences in play-style are large enough to influence champion trends in ARAM. We have access to region-specific datasets from the following servers:

  • EUW: Europe West
  • NA: North America
  • KR: Korea
  • JP: Japan
  • TW: Taiwan, Hong Kong, and Macau

ARAM_Regions.png

Questions¶

  • What champions do players favour?
  • What champions are the best to play?
  • What kind (e.g., role/class/sub-class) of champion is the best to play?
  • Are variables (e.g., gold, KDA, win rate) correlated with each other?
  • Are there differences in playstyle and champion statistics across LoL regions?
  • Is there a difference in champion statistics in low versus high ranks?

Global ARAM Data¶

Let's take a look at some ARAM data from September 2023 for players from all LoL servers that rank from Iron to Masters.

In [1]:
# import packages
import numpy as np
import pandas as pd
import glob
import matplotlib.pyplot as plt
import seaborn as sns

%config InlineBackend.figure_format = 'retina'

# import Global ARAM data
df = pd.read_csv('data/ARAM_Global_September_Status.csv')
df.sample(5)
Out[1]:
Champion Games played KDA Win Rate Pick Rate Ban Rate CS Gold
139 Qiyana 770,316 2.59:1 0.4325 0.0329 0 31.75 14,411
88 Leona 1,284,844 2.93:1 0.5200 0.0549 0 16.92 11,980
160 Olaf 430,800 2.6:1 0.5141 0.0184 0 48.45 14,035
75 Orianna 1,413,430 4.45:1 0.5233 0.0604 0 49.53 13,967
74 Amumu 1,426,008 2.75:1 0.5100 0.0610 0 20.83 12,362

Data Cleaning¶

To prepare the data frame, the following functions remove unnecessary columns, rework some values, and add more information about each champion. The 'Ban Rate' column is redundant because champion bans are not part of ARAM. The ':1' suffix is also removed from KDA so that the value can be stored as a float.

In [2]:
# clean up data frame
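def clean_dataframe(df):
    # ARAM has no champion bans, so this column carries no information
    del df['Ban Rate']
    # keep the part before ':' (e.g. '2.59:1' -> 2.59); str.strip(':1') would
    # also trim trailing '1' digits that belong to the number itself
    df['KDA'] = df['KDA'].str.split(':').str[0].astype(float)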
    df.rename(columns={'Champion': 'Character'}, inplace=True)
    return df

df['Games played'] = pd.to_numeric(df['Games played'].str.replace(',', ''), errors='coerce')
df['Gold'] = pd.to_numeric(df['Gold'].str.replace(',', ''), errors='coerce')
df = clean_dataframe(df)
df.sample(5)

# function to add a new column with values
def add_column_values(df, column_name, value_list):
    if len(df) != len(value_list):
        raise ValueError("Length of DataFrame and value_list must be the same.")
    df[column_name] = value_list
    return df
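
One subtlety when removing the ':1' suffix: Python's str.strip treats its argument as a set of characters to trim from both ends, not as a literal suffix. The sketch below uses made-up KDA strings (not rows from the dataset) to compare it with splitting on ':'. The helper name parse_kda is hypothetical.

```python
def parse_kda(text):
    """Return the ratio part of an 'x.xx:1' KDA string as a float."""
    return float(text.split(':')[0])

samples = ['2.59:1', '3.11:1', '2.6:1']   # illustrative strings only

for s in samples:
    stripped = s.strip(':1')   # trims any ':' or '1' characters at either end
    print(f'{s!r}: strip -> {stripped!r}, split -> {parse_kda(s)}')
```

A value such as '3.11:1' loses its trailing digits under strip, whereas splitting on ':' keeps them.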

Classification¶

Champions are classified by their set of abilities. For example, a champion whose abilities let them be stealthy and deal lots of damage at once would likely be an 'Assassin', whereas a champion with healing and protective abilities would be classified as a 'Support'.

There are many ways of classifying champions. 'Class' is a classification based on general abilities, and 'Sub-Class' is a more specific classification of abilities.

In [3]:
# Champions categorized by class
# https://leagueoflegends.fandom.com/wiki/Champion_classes

# Tanks
tanks = [
    'Alistar', 'Amumu', 'Gragas', 'Leona', 'Malphite', 'Maokai', 
    'Nautilus', 'Nunu & Willump', 'Ornn', 'Rammus', 'Rell', 'Sejuani',
    'Sion', 'Zac', 'Braum', 'Galio', 'K\'Sante', 'Poppy', 'Shen',
    'Tahm Kench', 'Taric'
]

# Fighters
fighters = [
    'Briar', 'Camille', 'Diana', 'Elise', 'Hecarim', 'Irelia', 'Jarvan IV',
    'Lee Sin', 'Olaf', 'Pantheon', 'Rek\'Sai', 'Renekton', 'Rengar', 'Skarner', 
    'Vi', 'Warwick', 'Wukong', 'Xin Zhao', 'Aatrox', 'Darius', 'Dr. Mundo', 'Garen', 
    'Illaoi', 'Mordekaiser', 'Nasus', 'Sett', 'Shyvana', 'Trundle', 'Udyr', 'Urgot', 
    'Volibear', 'Yorick'
]

# Slayers
slayers = [
    'Akali', 'Akshan', 'Diana', 'Ekko', 'Evelynn', 'Fizz', 'Kassadin', 'Katarina', 'Kha\'Zix',
    'Naafiri', 'Nocturne', 'Pyke', 'Qiyana', 'Rengar', 'Shaco', 'Talon', 'Yone', 'Zed', 
    'Bel\'Veth', 'Fiora', 'Gwen', 'Jax', 'K\'Sante', 'Kayn', 'Kled', 'Lillia', 'Master Yi',
    'Nilah', 'Riven', 'Sylas', 'Tryndamere', 'Viego', 'Yasuo', 'Yone'
]

# Mages
mages = [
    'Jayce', 'Lux', 'Varus', 'Vel\'Koz', 'Xerath', 'Ziggs', 'Anivia', 'Aurelion Sol', 'Cassiopeia',
    'Karthus', 'Malzahar', 'Rumble', 'Ryze', 'Swain', 'Taliyah', 'Viktor', 'Vladimir', 'Ahri',
    'Annie', 'Brand', 'Karma', 'LeBlanc', 'Lissandra', 'Lux', 'Neeko', 'Orianna', 'Seraphine', 
    'Sylas', 'Syndra', 'Twisted Fate', 'Veigar', 'Vex', 'Zoe'
]

# Marksmen (ADC - Attack Damage Carry)
marksmen = [
    'Akshan', 'Aphelios', 'Ashe', 'Caitlyn', 'Corki', 'Draven', 'Ezreal', 'Jhin', 'Jinx',
    'Kai\'Sa', 'Kalista', 'Kindred', 'Kog\'Maw', 'Lucian', 'Miss Fortune', 'Samira', 
    'Senna', 'Sivir', 'Tristana', 'Twitch', 'Varus', 'Vayne', 'Xayah', 'Zeri'
]

# Controllers
controllers = [
    'Bard', 'Blitzcrank', 'Ivern', 'Jhin', 'Morgana', 'Neeko', 'Pyke', 'Rakan', 'Thresh', 
    'Zyra', 'Janna', 'Karma', 'Lulu', 'Milio', 'Nami', 'Renata Glasc', 'Senna', 'Seraphine',
    'Sona', 'Soraka', 'Taric', 'Yuumi'
]

# Specialists
specialists = [
    'Azir', 'Cho\'Gath', 'Fiddlesticks', 'Gangplank', 'Gnar', 'Graves', 'Heimerdinger', 'Kayle',
    'Kennen', 'Nidalee', 'Quinn', 'Singed', 'Teemo', 'Zilean'
]
In [4]:
# Champions categorized by sub-class
# https://leagueoflegends.fandom.com/wiki/Champion_classes

vanguards = tanks[:14]
wardens = tanks[14:]
divers = fighters[:18]
juggernauts = fighters[18:]
assassins = slayers[:18]
skirmishers = slayers[18:]
artillery = mages[:6]
battlemages = mages[6:17]
burst = mages[17:]
catchers = controllers[:10]
enchanters = controllers[10:]
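
The slices above depend on the order of names inside each class list, so inserting a champion in the wrong position would silently change its sub-class. As an alternative sketch (not the approach used in this notebook), the sub-classes could be written out explicitly and inverted into a lookup table. Only a few champions per group are shown, and sub_classes and sub_class_of are hypothetical names.

```python
# explicit sub-class membership (abbreviated for illustration)
sub_classes = {
    'Vanguard': ['Alistar', 'Amumu', 'Leona'],
    'Warden': ['Braum', 'Galio', 'Shen'],
    'Artillery': ['Lux', 'Xerath', 'Ziggs'],
    'Enchanter': ['Janna', 'Lulu', 'Nami'],
}

# invert to champion -> sub-class; setdefault keeps the first group
# in which a champion appears
sub_class_of = {}
for name, champs in sub_classes.items():
    for champ in champs:
        sub_class_of.setdefault(champ, name)

print(sub_class_of.get('Galio', 'Unknown'))
print(sub_class_of.get('Teemo', 'Unknown'))   # not in the abbreviated table
```

With a table like this, a pandas column could be filled with something along the lines of df['Character'].map(sub_class_of).fillna('Unknown').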
In [5]:
# List of League of Legends champions categorized by role
# https://leagueoflegends.fandom.com/wiki/List_of_champions_by_draft_position

# Top Laners
top_laners = [
    'Aatrox', 'Akali', 'Camille', 'Cho\'Gath', 'Darius', 'Dr. Mundo', 'Fiora', 'Gangplank', 'Garen', 
    'Gnar', 'Illaoi', 'Irelia', 'Jax', 'Jayce', 'K\'Sante', 'Kayle', 'Kennen', 'Kled', 
    'Malphite', 'Maokai', 'Mordekaiser', 'Nasus', 'Ornn', 'Poppy', 'Quinn', 
    'Renekton', 'Riven', 'Rumble', 'Sett', 'Shen', 'Singed', 'Sion', 'Swain', 
    'Tahm Kench', 'Teemo', 'Tryndamere', 'Urgot', 'Vayne', 'Vladimir', 'Volibear', 
    'Wukong', 'Yasuo', 'Yone','Yorick', 'Gwen'
]

# Junglers
junglers = [
    'Amumu', 'Bel\'Veth', 'Briar', 'Diana', 'Ekko', 'Elise', 'Evelynn', 'Fiddlesticks', 
    'Gragas', 'Graves', 'Hecarim', 'Ivern', 'Jarvan IV', 'Jax', 'Karthus', 'Kayn', 'Kha\'Zix', 
    'Kindred', 'Lee Sin', 'Lillia', 'Master Yi', 'Nidalee', 'Nocturne', 'Olaf', 'Rammus', 
    'Rek\'Sai', 'Rengar', 'Sejuani', 'Shaco', 'Shyvana', 'Skarner', 'Taliyah', 'Trundle', 
    'Udyr', 'Vi', 'Viego', 'Volibear', 'Xin Zhao', 'Warwick', 'Zac'
]

# Mid Laners
mid_laners = [
    'Ahri', 'Akali', 'Akshan', 'Anivia', 'Annie', 'Aurelion Sol', 'Azir', 'Cassiopeia',
    'Corki', 'Diana', 'Ekko', 'Fizz', 'Galio', 'Irelia', 'Kassadin', 'Katarina', 'LeBlanc', 'Lissandra', 
    'Lux', 'Malzahar', 'Naafiri', 'Neeko', 'Nunu & Willump', 'Orianna', 'Pantheon', 
    'Qiyana', 'Ryze', 'Swain', 'Sylas', 'Syndra', 'Talon', 'Twisted Fate', 'Veigar', 'Vel\'Koz', 
    'Vex', 'Viktor', 'Vladimir', 'Xerath', 'Yasuo', 'Yone',  'Zed', 'Zoe', 'Heimerdinger', 'Ziggs'
]

# ADC (Attack Damage Carry)
adcs = [
    'Aphelios', 'Ashe', 'Caitlyn', 'Draven', 'Ezreal', 'Jhin', 'Jinx',
    'Kai\'Sa', 'Kalista', 'Kog\'Maw', 'Lucian', 'Miss Fortune', 'Nilah', 
    'Samira', 'Seraphine', 'Sivir', 'Tristana', 'Twitch', 'Varus', 'Vayne', 'Xayah',
    'Yasuo', 'Zeri'
]

# Supports
supports = [
    'Alistar', 'Bard', 'Blitzcrank', 'Brand', 'Braum', 'Janna', 'Karma', 'Leona',
    'Lulu', 'Lux', 'Malphite', 'Milio', 'Morgana', 'Nami', 'Nautilus', 'Pantheon', 'Pyke', 
    'Rakan', 'Rell', 'Renata Glasc', 'Senna', 'Seraphine', 'Sona', 'Soraka', 'Swain', 
    'Taric', 'Thresh', 'Vel\'Koz', 'Xerath', 'Yuumi', 'Zilean', 'Zyra'
]
In [6]:
# Classify champions by 'difficulty'
# based on https://leagueoflegends.fandom.com/wiki/List_of_champions/Ratings
easy = [
    'Alistar', 'Amumu', 'Annie', 'Ashe', 'Blitzcrank', 'Caitlyn', 'Cho\'Gath', 'Diana', 'Dr. Mundo', 
    'Ezreal', 'Garen', 'Janna', 'Jarvan IV', 'Jax', 'Karma', 'Leona', 'Lux', 'Malphite', 'Malzahar', 
    'Maokai', 'Master Yi', 'Milio', 'Miss Fortune', 'Morgana', 'Naafiri', 'Nasus', 'Neeko', 'Nocturne', 
    'Nunu & Willump', 'Olaf', 'Pantheon', 'Rammus', 'Renekton', 'Seraphine', 'Shyvana', 'Sion', 
    'Skarner', 'Sona', 'Soraka', 'Tahm Kench', 'Teemo', 'Tristana', 'Trundle', 'Tryndamere', 'Udyr', 'Vi', 'Volibear', 
    'Warwick', 'Wukong', 'Xin Zhao', 'Yuumi', 'Zac'
]

moderate = [
    'Aatrox', 'Ahri', 'Akali', 'Bel\'Veth', 'Brand', 'Braum', 'Briar', 'Corki', 'Darius', 'Draven', 
    'Elise', 'Evelynn', 'Fiddlesticks', 'Fiora', 'Fizz', 'Galio', 'Gragas', 'Graves', 'Gwen', 
    'Hecarim', 'Heimerdinger', 'Illaoi', 'Irelia', 'Jayce', 'Jhin', 'Jinx', 'Kai\'Sa', 'Karthus', 
    'Kassadin', 'Katarina', 'Kayle', 'Kayn', 'Kennen', 'Kha\'Zix', 'Kled', 'Kog\'Maw', 'LeBlanc', 
    'Lee Sin', 'Lissandra', 'Lucian', 'Lulu', 'Mordekaiser', 'Nami', 'Nautilus', 'Nidalee', 'Orianna',
    'Ornn', 'Poppy', 'Pyke', 'Quinn', 'Rakan', 'Rek\'Sai', 'Rell', 'Renata Glasc', 'Rengar', 'Riven', 
    'Rumble', 'Ryze', 'Samira', 'Sejuani', 'Senna', 'Sett', 'Shaco', 'Shen', 'Singed', 'Sivir', 'Swain',
    'Syndra', 'Taliyah', 'Talon', 'Taric', 'Twisted Fate', 'Twitch', 'Urgot', 'Varus', 'Vayne', 'Veigar',
    'Vel\'Koz', 'Vex', 'Vladimir', 'Xayah', 'Xerath', 'Yorick', 'Zeri', 'Ziggs', 'Zilean', 'Zyra'
]

hard = [
    'Akshan', 'Anivia', 'Aphelios', 'Aurelion Sol', 'Azir', 'Bard', 'Camille', 'Cassiopeia', 'Ekko',
    'Gangplank', 'Gnar', 'Ivern', 'K\'Sante', 'Kalista', 'Kindred', 'Lillia', 'Nilah', 'Qiyana', 'Sylas',
    'Thresh', 'Viego', 'Viktor', 'Yasuo', 'Yone', 'Zed', 'Zoe'
]
In [7]:
# add 'Class' column
df = add_column_values(df, 'Class', [
    'Tank' if champ in tanks else
    'Fighter' if champ in fighters else
    'Slayer' if champ in slayers else
    'Mage' if champ in mages else
    'Marksman' if champ in marksmen else
    'Controller' if champ in controllers else
    'Specialist' if champ in specialists else
    'Unknown'
    for champ in df['Character']
])

# add 'Sub-Class' column
df = add_column_values(df, 'Sub-Class', [
    'Vanguard' if champ in vanguards else
    'Warden' if champ in wardens else
    'Diver' if champ in divers else
    'Juggernaut' if champ in juggernauts else
    'Assassin' if champ in assassins else
    'Skirmisher' if champ in skirmishers else
    'Artillery' if champ in artillery else
    'Battlemage' if champ in battlemages else
    'Burst' if champ in burst else
    'Catcher' if champ in catchers else
    'Enchanter' if champ in enchanters else
    'Marksmen' if champ in marksmen else
    'Specialist' if champ in specialists else
    'Unknown'
    for champ in df['Character']
])
    
# add 'Role' column
df = add_column_values(df, 'Role', [
    'Top' if champ in top_laners else
    'Jungle' if champ in junglers else
    'Mid' if champ in mid_laners else
    'ADC' if champ in adcs else
    'Support' if champ in supports else
    'Unknown'
    for champ in df['Character']
])

# add 'Difficulty' column
df = add_column_values(df, 'Difficulty', [
    'Easy' if champ in easy else
    'Moderate' if champ in moderate else
    'Hard' if champ in hard else
    'Unknown'
    for champ in df['Character']
])

# set index
df = df.set_index('Character')

df.sample(10)
Out[7]:
Games played KDA Win Rate Pick Rate CS Gold Class Sub-Class Role Difficulty
Character
Nami 1161044 5.49 0.5104 0.0496 13.11 12234 Controller Enchanter Support Moderate
Kalista 811699 3.00 0.4844 0.0347 69.74 14791 Marksman Marksmen ADC Hard
Yuumi 1159557 8.65 0.5080 0.0496 6.13 12080 Controller Enchanter Support Easy
Ziggs 1606507 3.98 0.5197 0.0687 55.84 13713 Mage Artillery Mid Moderate
Milio 746275 6.18 0.5248 0.0319 17.38 11906 Controller Enchanter Support Easy
Leona 1284844 2.93 0.5200 0.0549 16.92 11980 Tank Vanguard Support Easy
Pyke 2424475 3.08 0.4514 0.1037 11.90 14193 Slayer Assassin Support Moderate
Sejuani 756768 3.20 0.5079 0.0324 31.38 13037 Tank Vanguard Jungle Moderate
Shyvana 889933 2.96 0.5001 0.0380 47.60 13905 Fighter Juggernaut Jungle Easy
Caitlyn 2956512 3.52 0.5176 0.1264 64.52 14666 Marksman Marksmen ADC Easy
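
Some champions appear in more than one class list (for example, 'Diana' is listed under both fighters and slayers), and the chained conditionals above assign whichever label is tested first. The sketch below, using abbreviated copies of three of the lists, shows one way to surface such overlaps. The names class_lists, membership, and first_match are hypothetical.

```python
from collections import defaultdict

# abbreviated copies of three class lists from the cells above
class_lists = {
    'Fighter': ['Briar', 'Camille', 'Diana'],
    'Slayer': ['Akali', 'Akshan', 'Diana'],
    'Marksman': ['Akshan', 'Aphelios', 'Ashe'],
}

# champion -> every label it is listed under, in testing order
membership = defaultdict(list)
for label, champs in class_lists.items():
    for champ in champs:
        membership[champ].append(label)

# champions that belong to more than one list
overlaps = {c: labels for c, labels in membership.items() if len(labels) > 1}
print(overlaps)

def first_match(champ):
    # same precedence as the chained conditional: the first list wins
    return membership.get(champ, ['Unknown'])[0]

print(first_match('Diana'))
```

Running the same check over the full lists would show which champions have a Class or Sub-Class that depends on the order of the conditionals.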

Variables¶

  • Character: each row is one of the 165 playable characters in the game
  • Games played: total number of games played with this character
  • KDA: 'Kills, Deaths, Assists'. Number represents average kills and assists per death
  • Win Rate: average win rate of that character
  • Pick Rate: average rate of selecting the character
  • CS: average creep score during the game, 'last-hits' on minions
  • Gold: average total gold accumulated at the end of the game
  • Class: classification of characters based on abilities and skill set
  • Sub-Class: more specific classification of characters based on abilities and skill set
  • Role: the character's most commonly played position in LoL's regular game mode, Summoner's Rift
  • Difficulty: based on the complexity of the character's abilities and skill set

Summary Statistics¶

In [8]:
summary_stats = df.describe().round(3)
print('Summary Statistics')
display(summary_stats)
Summary Statistics
        Games played     KDA  Win Rate  Pick Rate       CS       Gold
count        165.000  165.00   165.000    165.000  165.000    165.000
mean     1417491.212    3.27     0.499      0.061   42.325  13797.242
std       643218.875    0.81     0.022      0.027   17.484    887.570
min       262592.000    2.07     0.396      0.011    6.130  11635.000
25%       930918.000    2.77     0.490      0.040   30.330  13329.000
50%      1357428.000    3.09     0.504      0.058   39.290  13916.000
75%      1789494.000    3.47     0.516      0.076   55.030  14401.000
max      2973632.000    8.65     0.536      0.127   91.400  15914.000

Popularity of Characters¶

If character selection followed a truly random model with equal probabilities, then the games played and pick rate columns would be roughly constant across characters. However, this is not the case: the re-roll and swapping mechanisms allow for some degree of character selection.

Games played is the number of ARAM games in which the character was played. Pick rate is the same quantity expressed as a fraction of all games.
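
As a rough consistency check on this reading of pick rate, figures reported in this notebook can be plugged in directly. If pick rate is games played divided by the total number of games, then, with 10 champions in every game, pick rates should sum to roughly 10 across the 165 champions, and games played divided by pick rate should approximate the total number of games. The numbers below are copied from the summary statistics and from Ezreal's row in the most-played table. Rounding in the published values makes the match approximate.

```python
n_champions = 165
mean_pick_rate = 0.061            # from the summary statistics
mean_games_played = 1417491.212   # from the summary statistics

# sum of pick rates across champions (expected to be close to 10)
pick_rate_sum = n_champions * mean_pick_rate

# two estimates of the total number of games in the sample
total_from_sum = n_champions * mean_games_played / 10
total_from_ezreal = 2973632 / 0.1271   # Ezreal: games played / pick rate

print(round(pick_rate_sum, 2))
print(round(total_from_sum), round(total_from_ezreal))
```

The two estimates of the total agree closely, which supports interpreting pick rate as the share of games in which a champion appears.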

In [9]:
# Top 10 Characters by games played
print('Top 10 Characters by games played')
df.nlargest(10, 'Games played')
Top 10 Characters by games played
Out[9]:
Games played KDA Win Rate Pick Rate CS Gold Class Sub-Class Role Difficulty
Character
Ezreal 2973632 3.79 0.4906 0.1271 61.26 14358 Marksman Marksmen ADC Easy
Miss Fortune 2965667 3.47 0.5021 0.1268 65.32 14707 Marksman Marksmen ADC Easy
Caitlyn 2956512 3.52 0.5176 0.1264 64.52 14666 Marksman Marksmen ADC Easy
Jhin 2907554 4.28 0.5155 0.1243 64.61 15073 Marksman Catcher ADC Moderate
Lux 2852818 4.90 0.5256 0.1220 50.79 13907 Mage Artillery Mid Easy
Ashe 2828154 3.59 0.4998 0.1209 53.94 13343 Marksman Marksmen ADC Easy
Jinx 2820274 3.46 0.5098 0.1206 71.34 14466 Marksman Marksmen ADC Moderate
Varus 2744959 3.83 0.5063 0.1174 56.89 14897 Mage Artillery ADC Moderate
Veigar 2690944 3.40 0.5206 0.1151 59.52 14017 Mage Burst Mid Moderate
Morgana 2639049 4.36 0.5151 0.1128 37.39 13435 Controller Catcher Support Easy

At first glance, 7 of the Top 10 most played characters are ADC, 2 are Mid, and 1 is Support according to the Role column! All of them are Easy or Moderate on the Difficulty scale.

In [10]:
role_xlabel = ['Top', 'Jungle', 'Mid', 'ADC', 'Support']

# Pick Rate by Role 
fig, axs = plt.subplots(ncols=2, figsize=(10, 5), sharey=True)

sns.boxplot(data=df, x='Role', ax=axs[0],
            y='Pick Rate', order=role_xlabel)
axs[0].set_title('Pick Rate by Role')

# Pick Rate by Difficulty
sns.set_style('white')
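# draw on the second panel explicitly rather than relying on the current axes
sns.pointplot(data=df, x='Difficulty', ax=axs[1],
            y='Pick Rate', hue='Difficulty',
            join=False)
sns.stripplot(data=df, x='Difficulty', ax=axs[1],
              y='Pick Rate', hue='Difficulty', alpha=0.3)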
axs[1].set_title('Pick Rate by Difficulty')
axs[1].legend().set_visible(False)

fig.suptitle('Pick Rate Figures')
plt.tight_layout()
plt.show()

From these figures, there is a lot of variability in character pick rate within each Role and Difficulty group. Even though ADC characters make up the majority of the Top 10 in popularity, the heavily overlapping distributions make it unlikely, judging from the plots alone, that ADCs as a group have a significantly higher pick rate than the other Roles.
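
Whether a group difference is statistically significant cannot be read off a plot; it needs a test. One option is a permutation test on the difference in group means, sketched below with the standard library only. The pick rates here are placeholder values chosen for illustration, not taken from the dataset, and permutation_p_value is a hypothetical helper. A rank-based test such as Kruskal-Wallis would be another reasonable choice.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_resamples=2000, seed=0):
    """Two-sided permutation test on the difference in means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)   # randomly reassign values to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_resamples + 1)

# placeholder pick rates for two roles (NOT values from the dataset)
adc = [0.127, 0.121, 0.035, 0.064, 0.045, 0.110]
top = [0.044, 0.038, 0.052, 0.061, 0.029, 0.047]

p = permutation_p_value(adc, top)
print(f'p = {p:.3f}')
```

A small p-value would indicate that a difference as large as the observed one rarely arises when role labels are shuffled at random.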

Which Characters are the 'best'?¶

Let's take a look at characters by KDA.

In [11]:
# Top 10 KDA
print('Top 10 Characters by KDA')
df.nlargest(10, 'KDA')
Top 10 Characters by KDA
Out[11]:
Games played KDA Win Rate Pick Rate CS Gold Class Sub-Class Role Difficulty
Character
Yuumi 1159557 8.65 0.5080 0.0496 6.13 12080 Controller Enchanter Support Easy
Milio 746275 6.18 0.5248 0.0319 17.38 11906 Controller Enchanter Support Easy
Janna 1569905 5.50 0.5356 0.0671 21.77 12112 Controller Enchanter Support Easy
Nami 1161044 5.49 0.5104 0.0496 13.11 12234 Controller Enchanter Support Moderate
Soraka 1558708 5.12 0.4949 0.0666 10.72 11635 Controller Enchanter Support Easy
Lux 2852818 4.90 0.5256 0.1220 50.79 13907 Mage Artillery Mid Easy
Sona 1374954 4.83 0.5113 0.0588 17.72 12215 Controller Enchanter Support Easy
Ivern 490344 4.75 0.5207 0.0210 12.50 11688 Controller Catcher Jungle Hard
Xerath 1874995 4.65 0.5057 0.0802 42.14 13924 Mage Artillery Mid Moderate
Lulu 1470803 4.65 0.4779 0.0629 19.75 11923 Controller Enchanter Support Moderate

Interestingly, within the Top 10 Characters by KDA, 7 characters are in the 'Support' role, and all 7 of them are 'Enchanters'!

Previously, when we ranked the Top 10 by popularity, the vast majority of characters were ADCs, yet we don't see any when we rank by KDA.

Remember that KDA is a measure of both kills and assists per death. Because the skill set of enchanters is to enhance the abilities of teammates or hinder opponents, it can be inferred that the majority of their KDA score comes from assists. Unfortunately, the original dataset does not provide a statistic that differentiates between 'kills' and 'assists'.
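
To make this limitation concrete, KDA is conventionally computed as (kills + assists) / deaths. The sketch below, using invented per-game figures rather than dataset values, shows two very different stat lines that produce the same KDA. The kda helper is hypothetical, and treating zero deaths as one is an assumption made here to avoid division by zero.

```python
def kda(kills, assists, deaths):
    """(kills + assists) per death; zero deaths are treated as one."""
    return (kills + assists) / max(deaths, 1)

# invented stat lines for illustration
carry = kda(kills=12, assists=8, deaths=5)       # mostly kills
enchanter = kda(kills=2, assists=18, deaths=5)   # mostly assists

print(carry, enchanter)
```

Both stat lines yield the same ratio, so a high KDA alone says nothing about how many kills a character secures.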

What if we want to get kills instead of just assisting other players? Which characters would be the most fun in this sense? We can filter the dataset to exclude the 'Enchanters' Sub-Class. Let's take a look at what the Top 10 would look like.

In [12]:
# Top 10 KDA excluding Enchanters
print('Top 10 characters by KDA excluding the Enchanter Sub-Class')
df[df['Sub-Class'] != 'Enchanter'].nlargest(10, 'KDA')
Top 10 characters by KDA excluding the Enchanter Sub-Class
Out[12]:
Games played KDA Win Rate Pick Rate CS Gold Class Sub-Class Role Difficulty
Character
Lux 2852818 4.90 0.5256 0.1220 50.79 13907 Mage Artillery Mid Easy
Ivern 490344 4.75 0.5207 0.0210 12.50 11688 Controller Catcher Jungle Hard
Xerath 1874995 4.65 0.5057 0.0802 42.14 13924 Mage Artillery Mid Moderate
Zilean 1265143 4.59 0.4828 0.0541 47.92 13383 Specialist Specialist Support Moderate
Orianna 1413430 4.45 0.5233 0.0604 49.53 13967 Mage Burst Mid Moderate
Nidalee 2548146 4.40 0.4904 0.1089 25.66 14204 Specialist Specialist Jungle Moderate
Seraphine 1465766 4.40 0.5162 0.0627 46.67 13137 Mage Burst ADC Easy
Morgana 2639049 4.36 0.5151 0.1128 37.39 13435 Controller Catcher Support Easy
Jhin 2907554 4.28 0.5155 0.1243 64.61 15073 Marksman Catcher ADC Moderate
Karma 2083571 4.28 0.5049 0.0891 29.73 13111 Mage Burst Support Easy

In this filtered KDA list, only 3 supports are in the Top 10! 2 ADC characters are now on the list.

Distributions¶

Let's look at some distributions for KDA, win rate, gold, and CS.

In [13]:
# Distributions
fig, axs = plt.subplots(ncols=4, figsize=(12, 4))

sns.histplot(data=df, ax=axs[0],
             x='KDA')
axs[0].set_title('KDA')

sns.histplot(data=df, ax=axs[1],
             x='Win Rate', color='green')
axs[1].set_title('Win Rate')

sns.histplot(data=df, ax=axs[2],
             x='Gold', color='gold')
axs[2].set_title('Gold')

sns.histplot(data=df, ax=axs[3],
             x='CS', color='salmon')
axs[3].set_title('CS')

fig.suptitle('Distributions')
plt.tight_layout()
plt.show()

Gold, CS, and KDA resemble a normal distribution, though KDA is slightly positively skewed with some outliers.

Win rate is negatively skewed.
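
A quick numerical cross-check on the direction of skew is Pearson's second skewness coefficient, 3 × (mean − median) / standard deviation, which needs only the values in the Summary Statistics table. This is a rough indicator, not a substitute for a sample skewness computed from the raw data. The function name pearson_skew is hypothetical.

```python
def pearson_skew(mean, median, std):
    """Pearson's second skewness coefficient."""
    return 3 * (mean - median) / std

# (mean, median, std) copied from the Summary Statistics table
summary = {
    'KDA': (3.27, 3.09, 0.81),
    'Win Rate': (0.499, 0.504, 0.022),
}

for name, (mean, median, std) in summary.items():
    print(f'{name}: {pearson_skew(mean, median, std):+.2f}')
```

The coefficient comes out positive for KDA and negative for Win Rate, in line with the shapes of the histograms.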

KDA¶

From the Top 10 KDA list, we saw that characters of the Support role dominated. Do only a handful of Support characters skew the data or is the group itself simply better? If Support characters do have higher KDAs, meaning more kills and assists per death, does this mean that winning is more likely if Support characters are on our team? Let's analyze what kind or group of characters fare well in ARAM.

In [14]:
# Group by Role
fig, axs = plt.subplots(ncols=2, figsize=(10, 5), sharey=True)
sns.boxplot(data=df,
            x='Role', y='KDA',
            ax=axs[0], order=role_xlabel)
axs[0].set_title('KDA by Role')

# Looking at Support Role
df_support = df[df['Role'] == 'Support']

sns.boxplot(data=df_support,
           x='Sub-Class', y='KDA', ax=axs[1],
           order=['Catcher', 'Burst', 'Enchanter', 'Vanguard'])
axs[1].set_title('KDA by Support Sub-Classes')

plt.tight_layout()
plt.show()

Support role characters have the highest median KDA, which lies above the interquartile ranges (IQR) of all other Roles. If we split the Support group by their specific ability types and skill sets (Sub-Class), we see that the Enchanter sub-class has a noticeably greater KDA than the other Support sub-classes. Let's compare the Enchanter sub-class with all the other sub-classes.

In [15]:
# KDA by Sub-Class
sns.catplot(data=df, kind='box',
           x='Sub-Class', y='KDA',
           aspect=2.8)
plt.title('KDA by Character Sub-Classes')
plt.show()

The Enchanter sub-class has a higher KDA than all other Sub-Classes!

Win Rate¶

We previously explored KDA, which is the number of kills and assists per death. Now let's take a look at Win Rate to see whether it mirrors KDA.

In [16]:
fig, axs = plt.subplots(ncols=2, figsize=(16, 5), gridspec_kw={'width_ratios': [1, 3]}, sharey=True)
display(df.nlargest(10, 'Win Rate'))
# Win Rate by Role
sns.boxplot(data=df,
            x='Role', y='Win Rate',
            ax=axs[0], order=role_xlabel)
axs[0].set_title('Win Rate by Role')

# Win Rate by Sub-Class
sns.boxplot(data=df,
            x='Sub-Class', y='Win Rate',
            ax=axs[1])
axs[1].set_title('Win Rate by Sub-Class')

plt.tight_layout()
plt.show()
Games played KDA Win Rate Pick Rate CS Gold Class Sub-Class Role Difficulty
Character
Janna 1569905 5.50 0.5356 0.0671 21.77 12112 Controller Enchanter Support Easy
Lillia 1435605 3.20 0.5323 0.0614 45.40 13929 Slayer Skirmisher Jungle Hard
Galio 951369 3.29 0.5319 0.0407 32.60 12816 Tank Warden Mid Moderate
Malzahar 1258710 3.46 0.5311 0.0538 85.94 14057 Mage Battlemage Mid Easy
Lissandra 1176762 3.05 0.5287 0.0503 57.45 13867 Mage Burst Mid Moderate
Skarner 303198 3.12 0.5265 0.0130 26.59 12896 Fighter Diver Jungle Easy
Sion 1037683 2.50 0.5261 0.0444 46.28 13832 Tank Vanguard Top Easy
Lux 2852818 4.90 0.5256 0.1220 50.79 13907 Mage Artillery Mid Easy
Milio 746275 6.18 0.5248 0.0319 17.38 11906 Controller Enchanter Support Easy
Kog'Maw 1521893 3.30 0.5238 0.0651 63.44 15192 Marksman Marksmen ADC Moderate

Even though the Support role and Enchanter sub-class had the greatest KDA, this doesn't seem to translate to Win Rate. The Support role, while still having the highest median Win Rate, now falls within the IQR of other Roles. The Enchanter sub-class now looks quite average compared to other sub-classes.

Overall, win rate looks relatively similar across character sub-classes, except for Marksmen and Assassins, which have lower median win rates.

Based solely on the KDA and Win Rate patterns for the Support and Enchanter classifications, the correlation between KDA and Win Rate appears to be low. We can check this by running some correlations.

Correlations¶

Are Gold and CS correlated? How about Gold and KDA? Gold and Win Rate?

In [17]:
# Correlation matrix
correlation_matrix = df.corr(numeric_only=True).round(3)
print('Correlation Matrix')
display(correlation_matrix)

# Gold and CS correlation
fig, axs = plt.subplots(ncols=2, nrows=2, figsize=(10, 10))

sns.scatterplot(data=df, ax=axs[0, 0],
            x='Gold', y='CS')
axs[0, 0].set_title('Gold & CS')

sns.scatterplot(data=df, ax=axs[0, 1],
            x='Gold', y='KDA')
axs[0, 1].set_title('Gold & KDA')

sns.scatterplot(data=df, ax=axs[1, 0],
            x='Gold', y='Win Rate')
axs[1, 0].set_title('Gold & Win Rate')

sns.scatterplot(data=df, ax=axs[1, 1],
            x='Win Rate', y='KDA')
axs[1, 1].set_title('KDA & Win Rate')

fig.suptitle('Correlation Figures')
plt.tight_layout()
plt.show()
Correlation Matrix
              Games played     KDA  Win Rate  Pick Rate      CS    Gold
Games played         1.000   0.184     0.059      0.999   0.315   0.327
KDA                  0.184   1.000     0.291      0.181  -0.227  -0.391
Win Rate             0.059   0.291     1.000      0.046   0.035  -0.265
Pick Rate            0.999   0.181     0.046      1.000   0.314   0.325
CS                   0.315  -0.227     0.035      0.314   1.000   0.700
Gold                 0.327  -0.391    -0.265      0.325   0.700   1.000

There is a strong correlation (0.700) between CS and gold. This is expected as CS-ing (dealing the last blow to a minion) actively rewards you with gold.

There seems to be a weak negative correlation between Gold and KDA (-0.391), as well as Gold and Win Rate (-0.265). This is unexpected as in theory, gold is used to buy items, which increases the strength and potency of your character, hypothetically leading to more kills/assists.

There is a weak positive correlation between KDA and Win Rate (0.291), which is consistent with what we saw when comparing KDA and Win Rate for the Support role and Enchanter sub-class.
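
One way to gauge how much weight these coefficients deserve is to square them: for a simple linear relationship, r² is the share of variance in one variable accounted for by the other. The sketch below applies this to values copied from the correlation matrix above. The dictionary names are hypothetical.

```python
# Pearson r values copied from the correlation matrix
correlations = {
    ('CS', 'Gold'): 0.700,
    ('Gold', 'KDA'): -0.391,
    ('Gold', 'Win Rate'): -0.265,
    ('KDA', 'Win Rate'): 0.291,
}

# r squared: proportion of variance shared by each pair
r_squared = {pair: round(r ** 2, 3) for pair, r in correlations.items()}

for (a, b), value in r_squared.items():
    print(f'{a} vs {b}: r^2 = {value}')
```

On this measure, CS accounts for roughly half of the variance in Gold, while KDA accounts for under a tenth of the variance in Win Rate.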

Regional Differences¶

Let's import data from different regions and explore any potential inter-regional differences.

In [18]:
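# import data
from pathlib import Path

# tag each row with its region, taken from the file name
# (assumes the files are named '<REGION>.csv', e.g. 'data/regions/NA.csv');
# str.strip() removes a set of characters, not a literal prefix or suffix
df_regional = pd.concat([pd.read_csv(f).assign(Region=Path(f).stem)
                         for f in glob.glob('data/regions/*.csv')])

# clean up data
df_regional = clean_dataframe(df_regional)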

# add columns
# add 'Class' column
df_regional = add_column_values(df_regional, 'Class', [
    'Tank' if champ in tanks else
    'Fighter' if champ in fighters else
    'Slayer' if champ in slayers else
    'Mage' if champ in mages else
    'Marksman' if champ in marksmen else
    'Controller' if champ in controllers else
    'Specialist' if champ in specialists else
    'Unknown'
    for champ in df_regional['Character']
])

# add 'Sub-Class' column
df_regional = add_column_values(df_regional, 'Sub-Class', [
    'Vanguard' if champ in vanguards else
    'Warden' if champ in wardens else
    'Diver' if champ in divers else
    'Juggernaut' if champ in juggernauts else
    'Assassin' if champ in assassins else
    'Skirmisher' if champ in skirmishers else
    'Artillery' if champ in artillery else
    'Battlemage' if champ in battlemages else
    'Burst' if champ in burst else
    'Catcher' if champ in catchers else
    'Enchanter' if champ in enchanters else
    'Marksmen' if champ in marksmen else
    'Specialist' if champ in specialists else
    'Unknown'
    for champ in df_regional['Character']
])
    
# add 'Role' column
df_regional = add_column_values(df_regional, 'Role', [
    'Top' if champ in top_laners else
    'Jungle' if champ in junglers else
    'Mid' if champ in mid_laners else
    'ADC' if champ in adcs else
    'Support' if champ in supports else
    'Unknown'
    for champ in df_regional['Character']
])

# add 'Difficulty' column
df_regional = add_column_values(df_regional, 'Difficulty', [
    'Easy' if champ in easy else
    'Moderate' if champ in moderate else
    'Hard' if champ in hard else
    'Unknown'
    for champ in df_regional['Character']
])

# set index
df_regional = df_regional.set_index('Character')

df_regional.sample(10)
Out[18]:
Games played KDA Win Rate Pick Rate CS Gold Region Class Sub-Class Role Difficulty
Character
Fiddlesticks 211207 2.93 0.5113 0.0650 38.45 13293 NA Specialist Specialist Jungle Moderate
Jinx 835050 3.40 0.5122 0.1233 74.96 14102 KR Marksman Marksmen ADC Moderate
Yorick 27,669 3.02 0.4969 0.0169 54.80 12,833 TW Fighter Juggernaut Top Moderate
Trundle 174994 3.23 0.4953 0.0258 29.30 12470 KR Fighter Juggernaut Jungle Easy
Ziggs 23733 4.32 0.5405 0.0658 53.69 13013 JP Mage Artillery Mid Moderate
Twisted Fate 322836 3.19 0.5206 0.0817 51.19 14354 EUW Mage Burst Mid Moderate
Zeri 23113 3.48 0.4987 0.0641 84.10 14179 JP Marksman Marksmen ADC Moderate
Rumble 19563 3.00 0.5201 0.0542 37.60 13230 JP Mage Battlemage Top Moderate
Skarner 69637 3.00 0.5165 0.0103 22.98 12331 KR Fighter Diver Jungle Easy
Ziggs 226140 4.06 0.5199 0.0696 55.41 13683 NA Mage Artillery Mid Moderate
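
The chained conditional expressions used above to build the 'Class', 'Sub-Class', 'Role', and 'Difficulty' columns could be collapsed into a single lookup helper. The following is only a minimal sketch: `label_for` and `class_groups` are hypothetical names that are not part of the notebook, and the champion sets are truncated for illustration. As in the original chains, the first group that contains the champion wins.

```python
def label_for(champ, groups, default='Unknown'):
    """Return the label of the first group containing champ, else default."""
    for label, members in groups.items():
        if champ in members:
            return label
    return default

# Hypothetical, truncated champion sets (illustration only).
# Dict order matters: it plays the role of the if/else ordering.
class_groups = {
    'Tank': {'Zac', 'Shen'},
    'Fighter': {'Yorick', 'Trundle'},
    'Mage': {'Ziggs', 'Twisted Fate'},
}

# 'Teemo' is absent from the truncated sets, so it falls through to the default
characters = ['Ziggs', 'Shen', 'Teemo']
labels = [label_for(c, class_groups) for c in characters]
print(labels)
```

In the notebook, the resulting list could presumably be passed to `add_column_values` in the same way as the list comprehensions above, which would avoid repeating the chains for each data frame.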

Split by Region¶

In [19]:
# descriptive statistics
df_regional.groupby('Region').mean(numeric_only=True)
Out[19]:
KDA Win Rate Pick Rate CS
Region
EUW 3.241152 0.500450 0.060676 43.606788
JP 3.383879 0.497328 0.060688 39.192061
KR 3.285030 0.497601 0.060666 40.830061
NA 3.302364 0.499007 0.060685 41.836545
TW 3.366970 0.498215 0.060702 41.961515
In [20]:
fig, axs = plt.subplots(ncols=3, figsize=(15, 5))

sns.pointplot(data=df_regional, ax=axs[0],
              x='Role', y='KDA', hue='Region', 
              join=False, dodge=True, order=role_xlabel)
axs[0].set_title('KDA by Role')

sns.pointplot(data=df_regional, ax=axs[1],
              x='Role', y='Win Rate', hue='Region', 
              join=False, dodge=True, order=role_xlabel)
axs[1].set_title('Win Rate by Role')
axs[1].legend().set_visible(False)

sns.pointplot(data=df_regional, ax=axs[2],
              x='Difficulty', y='Pick Rate', hue='Region', 
              join=False, dodge=True)
axs[2].set_title('Pick Rate by Difficulty')
axs[2].legend().set_visible(False)

fig.suptitle('Comparison Across Regions')
plt.tight_layout()            
plt.show()
In [21]:
sns.catplot(data=df_regional, kind='box',
            x='Sub-Class', y='Win Rate', hue='Region',
            aspect=2.8)
plt.title('Win Rate of Sub-Classes by Region')
plt.show()

From these point and box plots, there do not appear to be any substantial differences in character statistics among regions. This is a visual impression rather than the result of a formal statistical test.
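
One way to check the regional comparison more formally would be a permutation test on a one-way ANOVA F statistic. The sketch below uses only the standard library and made-up win rates; the function names and the toy values are assumptions for illustration, not results from the data set. It also treats the observations as independent, which is a simplification because the same champions appear in every region.

```python
import random
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups (lists of numbers)."""
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    group_means = [mean(g) for g in groups]
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def permutation_p_value(groups, n_permutations=2000, seed=0):
    """Share of label shuffles whose F is at least as large as the observed F."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [v for g in groups for v in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_permutations + 1)

# Made-up win rates per region (illustration only, not from the data set)
toy_win_rates = {
    'NA': [0.51, 0.49, 0.50, 0.52, 0.48],
    'KR': [0.50, 0.49, 0.51, 0.50, 0.47],
    'EUW': [0.52, 0.50, 0.49, 0.51, 0.50],
}
p_value = permutation_p_value(list(toy_win_rates.values()))
print(f'permutation p-value: {p_value:.3f}')
```

With the real data, the groups would presumably come from something like the 'Win Rate' column of `df_regional` split by 'Region'; a large p-value would be consistent with the visual impression from the plots.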

Rank Differences¶

The game algorithm matches you with and against other players of similar skill level based on rank. Let's see if there are any differences in play-style among ranks.

In [22]:
# load in data
df_skilled = pd.read_csv('data/Diamond_ARAM_Global_September_Status.csv')

# clean
df_skilled = clean_dataframe(df_skilled)

# add columns
# add 'Class' column
df_skilled = add_column_values(df_skilled, 'Class', [
    'Tank' if champ in tanks else
    'Fighter' if champ in fighters else
    'Slayer' if champ in slayers else
    'Mage' if champ in mages else
    'Marksman' if champ in marksmen else
    'Controller' if champ in controllers else
    'Specialist' if champ in specialists else
    'Unknown'
    for champ in df_skilled['Character']
])

# add 'Sub-Class' column
df_skilled = add_column_values(df_skilled, 'Sub-Class', [
    'Vanguard' if champ in vanguards else
    'Warden' if champ in wardens else
    'Diver' if champ in divers else
    'Juggernaut' if champ in juggernauts else
    'Assassin' if champ in assassins else
    'Skirmisher' if champ in skirmishers else
    'Artillery' if champ in artillery else
    'Battlemage' if champ in battlemages else
    'Burst' if champ in burst else
    'Catcher' if champ in catchers else
    'Enchanter' if champ in enchanters else
    'Marksmen' if champ in marksmen else
    'Specialist' if champ in specialists else
    'Unknown'
    for champ in df_skilled['Character']
])
    
# add 'Role' column
df_skilled = add_column_values(df_skilled, 'Role', [
    'Top' if champ in top_laners else
    'Jungle' if champ in junglers else
    'Mid' if champ in mid_laners else
    'ADC' if champ in adcs else
    'Support' if champ in supports else
    'Unknown'
    for champ in df_skilled['Character']
])

# add 'Difficulty' column
df_skilled = add_column_values(df_skilled, 'Difficulty', [
    'Easy' if champ in easy else
    'Moderate' if champ in moderate else
    'Hard' if champ in hard else
    'Unknown'
    for champ in df_skilled['Character']
])

# set index
df_skilled = df_skilled.set_index('Character')

print('Load in data from High Ranks (skilled players)')
df_skilled.sample(5)
Load in data from High Ranks (skilled players)
Out[22]:
Games played KDA Win Rate Pick Rate CS Gold Class Sub-Class Role Difficulty
Character
Thresh 907 2.78 0.5149 0.1300 33.10 7240 Controller Catcher Support Hard
Hecarim 668 2.95 0.4865 0.0957 186.37 11506 Fighter Diver Jungle Moderate
Diana 399 2.18 0.4712 0.0572 173.77 11238 Fighter Diver Jungle Easy
Ivern 98 3.54 0.4694 0.0140 117.64 8877 Controller Catcher Jungle Hard
Heimerdinger 401 2.02 0.5162 0.0575 101.99 9217 Specialist Specialist Mid Moderate
In [23]:
difficulty_order = ['Easy', 'Moderate', 'Hard']  # category order in the data
x_label = ['Easy', 'Moderate', 'Difficult']      # tick labels, in the same order
fig, axs = plt.subplots(ncols=3, figsize=[14, 5], sharey=True)

# All Ranks Pick Rate by Difficulty
# (order= pins the category order so the relabelled ticks cannot be mismatched)
sns.set_style('white')
sns.pointplot(data=df, x='Difficulty', 
            y='Pick Rate', hue='Difficulty', 
            order=difficulty_order, join=False, ax=axs[0])
sns.stripplot(data=df, x='Difficulty', 
              y='Pick Rate', hue='Difficulty', 
              order=difficulty_order, ax=axs[0], alpha=0.3)
axs[0].set_title('All Ranks')
axs[0].set_xticklabels(x_label)
axs[0].legend().set_visible(False)

# High Ranks Pick Rate by Difficulty
sns.pointplot(data=df_skilled, x='Difficulty', 
            y='Pick Rate', hue='Difficulty', 
            order=difficulty_order, join=False, ax=axs[1])
sns.stripplot(data=df_skilled, x='Difficulty', 
              y='Pick Rate', hue='Difficulty', 
              order=difficulty_order, ax=axs[1], alpha=0.3)
axs[1].set_title('Diamond and Master')
axs[1].set_xticklabels(x_label)
axs[1].legend().set_visible(False)

# Pick Rate by Role, All Ranks vs. High Ranks
sns.pointplot(data=df, x='Role', ax=axs[2],
            y='Pick Rate', order=role_xlabel, label='All Ranks', 
             dodge=True)
axs[2].set_title('Pick Rate by Role')

sns.pointplot(data=df_skilled, x='Role', ax=axs[2],
            y='Pick Rate', order=role_xlabel, label='High Ranks', 
              color='r', dodge=True)
axs[2].legend().set_visible(True)
fig.suptitle('Pick Rate across Character Difficulty for All and High Ranks')
plt.tight_layout()
plt.show()

When comparing pick rate across character difficulty and roles, the variance looks similar between all ranks and high ranks (diamond and master).

A major limitation of this analysis is the lack of a separate data set for lower ranks (e.g., bronze, silver, and gold). The only comparison available is between high ranks and all ranks, and the all-ranks data set is itself diluted by high-rank games. As a result, differences between high and low ranks are harder to identify.
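
The dilution can be illustrated with a little arithmetic. If the high-rank games were a subset of the all-ranks games (an assumption that the data set description does not confirm), a champion's win rate outside the high ranks could be backed out by subtracting high-rank wins and games from the totals. The function name and the numbers below are hypothetical.

```python
def low_rank_win_rate(games_all, wr_all, games_high, wr_high):
    """Back out the win rate of the non-high-rank games.

    Assumes the high-rank games are a strict subset of the all-ranks games.
    """
    games_low = games_all - games_high
    if games_low <= 0:
        raise ValueError('high-rank games must be fewer than all-ranks games')
    wins_low = games_all * wr_all - games_high * wr_high
    return wins_low / games_low

# Hypothetical champion: 10,000 games at a 50% win rate overall,
# of which 1,000 high-rank games were won 55% of the time
estimate = low_rank_win_rate(10_000, 0.50, 1_000, 0.55)
print(f'estimated win rate outside high ranks: {estimate:.4f}')
```

In this made-up case the all-ranks figure of 50% sits between the high-rank value and the backed-out value, which is why comparing high ranks against all ranks understates the gap between high and low ranks.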

Champion Differences¶

In [24]:
# Top 10 by KDA
print('Descriptive Statistics for Characters played by High Rank Players')
display(df_skilled.describe())
print('Top 10 Characters by KDA in High Ranks')
display(df_skilled.nlargest(10, 'KDA'))

fig, axs = plt.subplots(ncols=3, figsize=(15, 5))

# KDA by Role
sns.pointplot(data=df,
            x='Role', y='KDA',
            ax=axs[0], order=role_xlabel, 
             label='All Ranks')
axs[0].set_title('KDA by Role')

sns.pointplot(data=df_skilled,
            x='Role', y='KDA',
            ax=axs[0], order=role_xlabel,
             color='r', label='High Ranks')
axs[0].legend().set_visible(True)

# Win Rate by Role
sns.pointplot(data=df,
            x='Role', y='Win Rate',
            ax=axs[1], order=role_xlabel)
axs[1].set_title('Win Rate by Role')

sns.pointplot(data=df_skilled,
            x='Role', y='Win Rate',
            ax=axs[1], order=role_xlabel, 
             color='r', dodge=True)

# Win Rate by Support sub-classes
df_skilled_support = df_skilled[df_skilled['Role'] == 'Support']

sns.pointplot(data=df_support,
           x='Sub-Class', y='Win Rate', ax=axs[2],
           order=['Catcher', 'Burst', 'Enchanter', 'Vanguard'])

sns.pointplot(data=df_skilled_support,
           x='Sub-Class', y='Win Rate', ax=axs[2],
           order=['Catcher', 'Burst', 'Enchanter', 'Vanguard'], 
             color='r')
axs[2].set_title('Win Rate by Support Sub-Classes')

plt.tight_layout()
plt.show()
Descriptive Statistics for Characters played by High Rank Players
Games played KDA Win Rate Pick Rate CS Gold
count 165.000000 165.000000 165.000000 165.000000 165.000000 165.000000
mean 422.848485 2.278970 0.494021 0.060967 147.718424 10311.945455
std 308.020699 0.785015 0.031863 0.044433 56.609024 1539.632120
min 24.000000 0.300000 0.402300 0.004400 4.380000 6726.000000
25% 209.000000 2.130000 0.476200 0.030000 133.470000 9467.000000
50% 360.000000 2.470000 0.497800 0.051600 166.780000 10743.000000
75% 537.000000 2.750000 0.514900 0.077000 187.050000 11395.000000
max 2270.000000 4.450000 0.567600 0.325400 216.850000 13185.000000
Top 10 Characters by KDA in High Ranks
Games played KDA Win Rate Pick Rate CS Gold Class Sub-Class Role Difficulty
Character
Yuumi 447 4.45 0.4564 0.0641 4.38 6919 Controller Enchanter Support Easy
Ivern 98 3.54 0.4694 0.0140 117.64 8877 Controller Catcher Jungle Hard
Jarvan IV 795 3.50 0.4792 0.1139 144.32 10640 Fighter Diver Jungle Easy
Zac 490 3.49 0.5224 0.0702 131.60 9466 Tank Vanguard Jungle Easy
Shen 332 3.40 0.5000 0.0476 133.47 8833 Tank Warden Top Moderate
Evelynn 254 3.39 0.5236 0.0364 149.15 11204 Slayer Assassin Jungle Moderate
Lillia 406 3.29 0.5517 0.0582 176.70 11321 Slayer Skirmisher Jungle Hard
Seraphine 446 3.26 0.5202 0.0639 115.77 9150 Mage Burst ADC Easy
Zilean 257 3.25 0.4591 0.0368 53.92 7733 Specialist Specialist Support Moderate
Nunu & Willump 165 3.25 0.5212 0.0245 144.37 9899 Tank Vanguard Mid Easy

Compared with all ranks, KDA is lower for every role at high ranks, whereas win rate is similar across roles except for Support.

On closer inspection of the Support role, the Enchanter sub-class has a surprisingly low win rate at High Ranks. This contrasts with Enchanters' high KDA and average win rate at All Ranks.

Correlations¶

In [25]:
print('Correlation at All Ranks')
display(correlation_matrix)
skilled_correlation = df_skilled.corr(numeric_only=True).round(3)
print('Correlations at High Ranks')
display(skilled_correlation)
Correlation at All Ranks
Games played KDA Win Rate Pick Rate CS Gold
Games played 1.000 0.184 0.059 0.999 0.315 0.327
KDA 0.184 1.000 0.291 0.181 -0.227 -0.391
Win Rate 0.059 0.291 1.000 0.046 0.035 -0.265
Pick Rate 0.999 0.181 0.046 1.000 0.314 0.325
CS 0.315 -0.227 0.035 0.314 1.000 0.700
Gold 0.327 -0.391 -0.265 0.325 0.700 1.000
Correlations at High Ranks
Games played KDA Win Rate Pick Rate CS Gold
Games played 1.000 0.075 0.102 0.997 0.030 0.068
KDA 0.075 1.000 0.138 0.074 -0.376 -0.285
Win Rate 0.102 0.138 1.000 0.098 0.167 0.236
Pick Rate 0.997 0.074 0.098 1.000 0.032 0.072
CS 0.030 -0.376 0.167 0.032 1.000 0.929
Gold 0.068 -0.285 0.236 0.072 0.929 1.000
In [26]:
# Correlations
fig, axs = plt.subplots(ncols=2, nrows=2, figsize=(10, 10))

sns.scatterplot(data=df, ax=axs[0, 0],
            x='Gold', y='CS', label='All Ranks')
axs[0, 0].set_title('Gold & CS')

sns.scatterplot(data=df_skilled, ax=axs[0, 0],
            x='Gold', y='CS', color='red',
               label='High Ranks')

sns.scatterplot(data=df, ax=axs[0, 1],
            x='Gold', y='KDA')
axs[0, 1].set_title('Gold & KDA')

sns.scatterplot(data=df_skilled, ax=axs[0, 1],
            x='Gold', y='KDA', color='red')

sns.scatterplot(data=df, ax=axs[1, 0],
            x='Gold', y='Win Rate')
axs[1, 0].set_title('Gold & Win Rate')

sns.scatterplot(data=df_skilled, ax=axs[1, 0],
            x='Gold', y='Win Rate', color='red')

sns.scatterplot(data=df, ax=axs[1, 1],
            x='Win Rate', y='KDA')
axs[1, 1].set_title('KDA & Win Rate')

sns.scatterplot(data=df_skilled, ax=axs[1, 1],
            x='Win Rate', y='KDA', color='red')

fig.suptitle('Correlation Differences by Rank')
plt.tight_layout()
plt.show()

The correlation between Gold and CS is very strong at High Ranks (0.929) and greater than at All Ranks (0.700).

There is a weak positive correlation between Gold and Win Rate at High Ranks (0.236), compared with a weak negative correlation at All Ranks (-0.265).

Differences between ranks in the Gold and KDA correlation (-0.391 at All Ranks vs. -0.285 at High Ranks) and in the Win Rate and KDA correlation (0.291 vs. 0.138) seem small.
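
Whether two correlation coefficients differ by more than chance could be gauged with Fisher's z-transformation. The sketch below applies it to the Gold and Win Rate correlations reported above. It assumes 165 champions in each data set (the High Ranks table reports a count of 165; the All Ranks count is an assumption) and treats the two samples as independent, which is not strictly true because both describe the same champions. The function name is hypothetical.

```python
import math

def fisher_z_difference(r1, n1, r2, n2):
    """z statistic and two-sided p-value for the difference of two correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)        # Fisher transformation
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))    # standard error of z2 - z1
    z = (z2 - z1) / se
    p = math.erfc(abs(z) / math.sqrt(2))           # two-sided normal p-value
    return z, p

# Gold vs. Win Rate: -0.265 at All Ranks, 0.236 at High Ranks
# n = 165 for both samples is an assumption (see the text above)
z_stat, p_val = fisher_z_difference(-0.265, 165, 0.236, 165)
print(f'z = {z_stat:.2f}, p = {p_val:.2g}')
```

Under these assumptions, a large absolute z would suggest that the change in sign between All Ranks and High Ranks is unlikely to be noise alone, although the independence caveat means the result should be read as indicative only.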

Limitations¶

  • KDA is the only variable describing kills, deaths, and assists. Separate values for kills, deaths, and assists would make it possible to better differentiate which characters are the 'best' based on average kills, deaths, and assists per game.
  • The only rank-specific data set is for High Ranks; there is no data set that isolates low ranks. This analysis could only compare high ranks with all ranks, so differences between high and low ranks may not be observed. A more accurate comparison between ranks would need a separate data set comprising only low-rank data.
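
To illustrate the first limitation: KDA is computed as (kills + assists) / deaths, so very different play patterns can produce the same value. A minimal sketch follows, with a hypothetical `kda` helper and made-up per-game numbers; treating zero deaths as one is a common convention and an assumption here.

```python
def kda(kills, deaths, assists):
    """(kills + assists) / deaths, treating zero deaths as one (assumed convention)."""
    return (kills + assists) / max(deaths, 1)

# Made-up per-game averages for two very different play styles
damage_dealer = kda(kills=12, deaths=5, assists=3)
enabler = kda(kills=1, deaths=5, assists=14)
print(damage_dealer, enabler)
```

Both hypothetical profiles give the same KDA even though one relies on kills and the other on assists, which is the information lost when only the combined ratio is reported.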